Shantou
Learning Discrete Latent Variable Structures with Tensor Rank Conditions Zhengming Chen
Unobserved discrete data are ubiquitous in many scientific disciplines, and how to learn the causal structure of these latent variables is crucial for uncovering data patterns. Most studies focus on the linear latent variable model or impose strict constraints on latent structures, which fail to address cases in discrete data involving non-linear relationships or complex latent structures.
Distributed optimization: designed for federated learning
Guo, Wenyou, Qu, Ting, Pan, Chunrong, Huang, George Q.
--Federated Learning (FL), as a distributed collaborative Machine Learning (ML) framework under privacy-preserving constraints, has garnered increasing research attention in cross-organizational data collaboration scenarios. This paper proposes a class of distributed optimization algorithms based on the augmented Lagrangian technique, designed to accommodate diverse communication topologies in both centralized and decentralized FL settings. Furthermore, we develop multiple termination criteria and parameter update mechanisms to enhance computational efficiency, accompanied by rigorous theoretical guarantees of convergence. By generalizing the augmented Lagrangian relaxation through the incorporation of proximal relaxation and quadratic approximation, our framework systematically recovers a broad of classical unconstrained optimization methods, including proximal algorithm, classic gradient descent, and stochastic gradient descent, among others. Notably, the convergence properties of these methods can be naturally derived within the proposed theoretical framework. Numerical experiments demonstrate that the proposed algorithm exhibits strong performance in large-scale settings with significant statistical heterogeneity across clients. Such formulations, commonly referred to as consensus optimization problems, find widespread applications in interdisciplinary domains including distributed ML, collaborative sensing in sensor networks, and distributed parameter estimation [1]. This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant 52375498, and in part by the Fundamental Research Funds for the Central Universities under Grant 21623111. Ting Qu is with Guangdong International Cooperation Base of Science and Technology for GBA Smart Logistics, Jinan University, Zhuhai 519070, China, also with School of Intelligent Systems Science and Engineering, Jinan University, Zhuhai 519070, China, and also with Institute of Physical Internet, Jinan University, Zhuhai 519070, China (e-mail: quting@jnu.edu.cn).
Unifying Biomedical Vision-Language Expertise: Towards a Generalist Foundation Model via Multi-CLIP Knowledge Distillation
Wang, Shansong, Jin, Zhecheng, Hu, Mingzhe, Safari, Mojtaba, Zhao, Feng, Chang, Chih-Wei, Qiu, Richard LJ, Roper, Justin, Yu, David S., Yang, Xiaofeng
CLIP models pretrained on natural images with billion-scale image-text pairs have demonstrated impressive capabilities in zero-shot classification, cross-modal retrieval, and open-ended visual answering. However, transferring this success to biomedicine is hindered by the scarcity of large-scale biomedical image-text corpora, the heterogeneity of image modalities, and fragmented data standards across institutions. These limitations hinder the development of a unified and generalizable biomedical foundation model trained from scratch. To overcome this, we introduce MMKD-CLIP, a generalist biomedical foundation model developed via Multiple Medical CLIP Knowledge Distillation. Rather than relying on billion-scale raw data, MMKD-CLIP distills knowledge from nine state-of-the-art domain-specific or generalist biomedical CLIP models, each pretrained on millions of biomedical image-text pairs. Our two-stage training pipeline first performs CLIP-style pretraining on over 2.9 million biomedical image-text pairs from 26 image modalities, followed by feature-level distillation using over 19.2 million feature pairs extracted from teacher models. We evaluate MMKD-CLIP on 58 diverse biomedical datasets, encompassing over 10.8 million biomedical images across nine image modalities. The evaluation spans six core task types: zero-shot classification, linear probing, cross-modal retrieval, visual question answering, survival prediction, and cancer diagnosis. MMKD-CLIP consistently outperforms all teacher models while demonstrating remarkable robustness and generalization across image domains and task settings. These results underscore that multi-teacher knowledge distillation is a scalable and effective paradigm for building high-performing biomedical foundation models under the practical constraints of real-world data availability.
Causal View of Time Series Imputation: Some Identification Results on Missing Mechanism
Cai, Ruichu, Zheng, Kaitao, Huang, Junxian, Li, Zijian, Chen, Zhengming, Xu, Boyan, Hao, Zhifeng
Time series imputation is one of the most challenge problems and has broad applications in various fields like health care and the Internet of Things. Existing methods mainly aim to model the temporally latent dependencies and the generation process from the observed time series data. In real-world scenarios, different types of missing mechanisms, like MAR (Missing At Random), and MNAR (Missing Not At Random) can occur in time series data. However, existing methods often overlook the difference among the aforementioned missing mechanisms and use a single model for time series imputation, which can easily lead to misleading results due to mechanism mismatching. In this paper, we propose a framework for time series imputation problem by exploring Different Missing Mechanisms (DMM in short) and tailoring solutions accordingly. Specifically, we first analyze the data generation processes with temporal latent states and missing cause variables for different mechanisms. Sequentially, we model these generation processes via variational inference and estimate prior distributions of latent variables via normalizing flow-based neural architecture. Furthermore, we establish identifiability results under the nonlinear independent component analysis framework to show that latent variables are identifiable. Experimental results show that our method surpasses existing time series imputation techniques across various datasets with different missing mechanisms, demonstrating its effectiveness in real-world applications.
Ψ-Arena: Interactive Assessment and Optimization of LLM-based Psychological Counselors with Tripartite Feedback
Zhu, Shijing, Chen, Zhuang, Bi, Guanqun, Li, Binghang, Deng, Yaxi, Wan, Dazhen, Peng, Libiao, Xiao, Xiyao, Zhang, Rongsheng, Lv, Tangjie, Hu, Zhipeng, Li, FangFang, Huang, Minlie
Large language models (LLMs) have shown promise in providing scalable mental health support, while evaluating their counseling capability remains crucial to ensure both efficacy and safety. Existing evaluations are limited by the static assessment that focuses on knowledge tests, the single perspective that centers on user experience, and the open-loop framework that lacks actionable feedback. To address these issues, we propose Ψ-Arena, an interactive framework for comprehensive assessment and optimization of LLM-based counselors, featuring three key characteristics: (1) Realistic arena interactions that simulate real-world counseling through multi-stage dialogues with psychologically profiled NPC clients, (2) Tripartite evaluation that integrates assessments from the client, counselor, and supervisor perspectives, and (3) Closed-loop optimization that iteratively improves LLM counselors using diagnostic feedback. Experiments across eight state-of-the-art LLMs show significant performance variations in different real-world scenarios and evaluation perspectives. Moreover, reflection-based optimization results in up to a 141% improvement in counseling performance. We hope PsychoArena provides a foundational resource for advancing reliable and human-aligned LLM applications in mental healthcare.
STEI-PCN: an efficient pure convolutional network for traffic prediction via spatial-temporal encoding and inferring
Hu, Kai, Zhao, Zhidan, Hao, Zhifeng
STEI-PCN: an efficient pure convolutional network for traffic prediction via spatial-temporal encoding and inferring Kai Hu a, Zhidan Zhao b,c,, Zhifeng Hao a a Department of Mathematic, School of Mathematics and Computer Sciences, Shantou University, Shantou, 515063, Guangdong, China b Department of Computer Science, School of Mathematics and Computer Sciences, Shantou University, Shantou, 515063, Guangdong, China c Complexity Computation Laboratory, Department of Computer Science, School of Mathematics and Computer Sciences, Shantou University, Shantou, 515603, Guangdong, ChinaAbstract Traffic data exhibits complex temporal, spatial, and spatial-temporal correlations. Capturing and integrating these correlations is crucial for building accurate prediction models. Although numerous deep learning-based traffic prediction models have been developed, most of these models use either independent modules to separately extract temporal and spatial correlations or joint modules to synchronously extract them, without considering the spatial-temporal correlations. Moreover, models that consider joint spatial-temporal correlations (temporal, spatial, and spatial-temporal correlations) often encounter significant challenges in accuracy and computational efficiency which prevent such models from demonstrating the expected advantages of a joint spatial-temporal correlations architecture. To address these issues, this paper proposes an efficient pure convolutional network for traffic prediction via spatial-temporal encoding and inferring (STEI-PCN). The model introduces and designs a dynamic adjacency matrix inferring module based on absolute spatial and temporal coordinates, as well as relative spa-Corresponding author at: Department of Computer Science, School of Mathematics and Computer Sciences, Shantou University, Shantou, 515063, Guangdong, China and Complexity Computation Laboratory, Department of Computer Science, School of Mathematics and Computer Sciences, Shantou University, Shantou, 515603, Guangdong, China.